Using Impurity and Depth for Decision Trees Pruning

Author

  • D. Fournier
Abstract

Most pruning methods for decision trees minimize the classification error rate. In uncertain domains, some sub-trees that do not lower the error rate can still be relevant, either to point out populations of specific interest or to give a representation of a large data file. We propose here a new pruning method (called pruning) that takes into account the complexity of sub-trees and is able to keep sub-trees whose leaves yield relevant decision rules, even when keeping them does not increase the classification efficiency.

Similar resources

A Trade-Off Between Depth and Impurity for Pruning Decision Trees

Most pruning methods for decision trees minimize the classification error rate. In uncertain domains, some sub-trees that do not lower the error rate can still be relevant, either to point out populations of specific interest or to give a representation of a large data file. We propose here a new pruning method (called pruning) that takes into account the complexity of sub-trees and which is able to ke...

Cost-Sensitive Decision Trees with Pre-pruning

This paper explores two simple and efficient pre-pruning strategies for the cost-sensitive decision tree algorithm to avoid overfitting. One is to limit the cost-sensitive decision trees to a depth of two. The other is to prune the trees with a pre-specified threshold. Empirical study shows that, compared to the error-based tree algorithm C4.5 and several other cost-sensitive tree algorithms, t...

Evaluation of liquefaction potential based on CPT results using C4.5 decision tree

The prediction of liquefaction potential of soil due to an earthquake is an essential task in Civil Engineering. The decision tree is a tree structure consisting of internal and terminal nodes which process the data to ultimately yield a classification. C4.5 is a known algorithm widely used to design decision trees. In this algorithm, a pruning process is carried out to solve the problem of the...

Adjusting for Multiple Comparisons in Decision Tree Pruning

Pruning is a common technique to avoid overfitting in decision trees. Most pruning techniques do not account for one important factor: multiple comparisons. Multiple comparisons occur when an induction algorithm examines several candidate models and selects the one that best accords with the data. Making multiple comparisons produces incorrect inferences about mo...

Publication date: 2001